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We combined the genetic crossover, which is one of the operations of genetic algorithm, and replica-exchange 
method in parallel molecular dynamics simulations. The genetic crossover and replica-exchange method can 
search the global conformational space by exchanging the corresponding parts between a pair of conformations 
of a protein. In this study, we applied this method to an a-helical protein, Trp-cage mini protein, which has 
20 amino-acid residues. The conformations obtained from the simulations are in good agreement with the 
experimental results. 


I. INTRODUCTION 


The search of the stable states of biomolecules, such 
as DNA and proteins, by molecular simulations is impor¬ 
tant to understand their functions and stabilities. How¬ 
ever, as the biomolecules have a lot of local minimum- 
energy states separated by high energy barriers, con¬ 
ventional molecular dynamics (MD) and Monte Carlo 
(MC) simulations tend to get trapped in states of lo¬ 
cal minima. To overcome this difficulty, various sam¬ 
pling and optimization methods for conformations of 
biomolecules have been proposed such as generalized- 
ensemble algorithms^ which include the multicanonical 
algorithm (MUCA)^!^, simulated tempering (ST)^!^ and 
replica-exchange method (REM)^ii. We have also pro¬ 
posed a conformational search method referred to as 
the parallel simulated annealing using genetic crossover 
(PSA/GAc}^“— , which is a hybrid algorithm combining 
both simulated annealing (SA)i^ and genetic algorithm 
(GA}^iii. In this method, parallel simulated annealing 
simulations are combined with genetic crossover, which 
is one of the operations of genetic algorithm. Moreover, 
we proposed a method that combines parallel MD simu¬ 
lations and genetic crossover with Metropolis criteriooi^. 

In this study, we applied this latest conformational 
search methodic using the genetic crossover to Trp-cage 
mini protein, which has 20 residues. The operation of 
the genetic crossover is combined with the conventional 
MD and REM. The obtained conformations during the 
simulation are in good agreement with the experimental 
results. This article is organized as follows. In Section 2 
we explain the present methods. In Section 3 we present 
the results. Section 4 is devoted to conclusions. 


II. METHODS 

A. Parallel molecular dynamics using genetic crossover 

We briefly describe our methodic. We first prepare M 
initial conformations of the system in study, where M 
is the total number of “individuals” in genetic algorithm 
and is usually taken to be an even integer. We then 
alternately perform the following two steps: 

1. For the M individuals, regular canonical MG or 
MD simulations at a fixed temperature T are car¬ 
ried out simulataneously and independently for a 
certain MC or MD steps. 

2. M/2 pairs of conformations are selected from 
“parental” group randomly, and the crossover and 
selection operations are performed. Here, the 
parental group means the latest conformations ob¬ 
tained in Step I. 

If we employ MC simulations in Step I above, we 
can refer the method to as parallel Monte Carlo us¬ 
ing genetic crossover (PMC/GAc) and if MD simula¬ 
tions, parallel molecular dynamics using genetic crossover 
(PMD/GAc). In Step 2, we can employ various kinds 
of genetic crossover operations. Here, we just present a 
case of the two-point crossover (see RefAi). The follow¬ 
ing procedure is carried out (see Fig. [1]) : 


1. Consecutive amino acids of length n residues in 
the amino-acid sequence of the conformation are 
selected randomly for each pair of selected confor¬ 
mations. 
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FIG. 1. Schematic process of the two-point crossover oper¬ 
ation. In this process, all dihedral angles (in backbone and 
side chains) within the randomly selected n consecutive amino 
acids are exchanged between a pair of conformations. 


2. Dihedral angles (in only backbone or all dihedral 
angles) in the selected n amino acids are exchanged 
between the selected pair of conformations. 

Note that the length n of consecutive amino-acid residues 
can, in general, be different for each pair of selected con¬ 
formations. 

We need to deal with the produced “child” conforma¬ 
tions with care. Because the produced conformations of¬ 
ten have unnatural structures by the crossover operation, 
they have high potential energy and are unstable. There¬ 
fore, a relaxation process is introduced before the selec¬ 
tion operation. Short simulations at the same tempera¬ 
ture T with restraints on the backbone dihedral angles of 
only the n amino acids are performed so that the corre¬ 
sponding backbone structures of the n amino acids will 
approach the exchanged backbone conformation. The 
initial conformations for these equilibration simulations 
are the ones before the exchanges. Namely, by these equi¬ 
libration simulations, the corresponding backbone con¬ 
formations of the n amino acids gradually transform from 
the ones before the exchanges to the ones after the ex¬ 
changes. We then perform short equilibration simula¬ 
tions without the restraints. We select the last confor¬ 
mations in the equilibratoin simulations as “child” con¬ 
formations. 

In the final stage in Step 2, the selection operation is 
performed. We select a superior “chromosome” (confor¬ 
mation) from the parent-child pair. For this selection 
operation, we employ Metropolis criterion!^, which se¬ 
lects the new child conformation from the parent with 
the following probability: 

i(;(p c) = min(l,exp{-/3[F;c - ^^p]}), (1) 

where Ep and stand for the potential energy of the 
parental conformation and the child conformation of the 
parent-child pair, respectively. /3 is the inverse tempera¬ 
ture, which is defined by /3 = 1 /ksT is the Boltzmann 
constant). 


B. Combination with replica-exchange method 

The sampling method using genetic crossover in the 
previous subsection can be easily combined with other 
sampling methods such as generalized-ensemble algo¬ 
rithms. Firstly, the conventional MC or MD in Step 1 
above can be replaced by other sampling methods such 
as MUCA and ST. Secondly, the above method can be 
combined with REM in Step 2 above. 

As an example, we introduce a method that combines 
genetic crossover and REM. We first prepare M initial 
conformations of the system in study, where M is the 
total number of “individuals” (in genetic algorithm) or 
replicas (in REM) and is usually taken to be an even in¬ 
teger. While only one temperature value was used in the 
previous method, we prepare M different temperature 
values (Ti, • • • ,Tm) here. Without loss of generality, we 
can assume that Ti < ••• < Tm- We then alternately 
perform the following two steps: 

1. For the M individuals, regular canonical MC or 
MD simulations at the fixed temperature Tm {m = 
1, • • •, M) are carried out simulataneously and in¬ 
dependently for a certain MC or MD steps. 

2. M/2 pairs of conformations at neighboring temper¬ 
atures are selected from “parental” group, and one 
of the following two operations is performed. 

(a) Two-point genetic crossover is performed for 
each pair of parents to produce tow children, 
and new child conformations are accepted 
with the probability in Eq. (HD. 

(b) each pair of replicas i and j (with coordi¬ 
nates and corresponding to neighbor¬ 
ing temperatures Tm and Tm+i, respectively, 
is exchanged with the following probability: 

Yt j) = min (1, exp{—A}), (2) 

where 

A = (/3^ - Pm+iKE{q^^) - E{q^^)). (3) 

Here, Pm is the inverse temperature {Pm = 
l/kBTm) and ^(gW) is the potential energy 
of replica i before replica exchange. If MD is 
employed in Step 1, we also have to rescale 
momenta after replica exchange^. 

If we employ MC simulations in Step 1 above, we can 
refer the method to as replica-exchange Monte Carlo us¬ 
ing genetic crossover (REMC/GAc) and if MD simula¬ 
tions, replica-exchange molecular dynamics using genetic 
crossover (REMD/GAc). In the above formulation, we 
chose pairs of the parent individuals (replicas) that cor¬ 
respond to neighboring temperatures. This is to make 
the acceptance of replica exchange high. Hence, as far as 
the crossover operations are concerned, we could select 
pairs of parents randomly. 
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III. RESULTS AND DISCUSSION 


We applied the present methods, namely, PMD/GAc 
and REMD/GAc, to Trp-cage. Trp-cage is known to be 
one of the smallest protein-like model systems and has 
20 amino-acid residues. This mini protein was studied 
experimentally by NMR measurements at 282 K (PDB 
ID: lL2Y)iS. 

We incorporated our genetic crossover sampling meth¬ 
ods by modifying the TINKER program packageii. The 
unit time step was set to 1.0 fs. Each simulation for sam¬ 
pling was carried out for 10.0 nsec (hence, it consisted 
of 10,000,000 MD steps) with 16 individuals (M = 16). 
Namely, the total simulation time for sampling was 160.0 
nsec. We performed 200 crossover operations, which 
selected consecutive amino-acid residues of length be¬ 
tween 2 to 10, during the simulationsii. The temper¬ 
ature during MD simulations was controlled by Nose- 
Hoover method^. The temperature was set at 282 K for 
the PMD/GAc simulation (the same as the experimen¬ 
tal condition). For REMD/GAc, the number of replicas 
were also set to M = 16. The temperatures were dis¬ 
tributed exponentially: 650, 612, 577, 544, 512, 483, 455, 
428, 404, 380, 358, 338, 318, 300, 282, and 266 K. During 
the REMD/GAc simulation, we performed 100 genetic 
crossover operations and 10,000 replica-exchange opera¬ 
tions. As for the conformational potential energy calcu¬ 
lations, we used the AMBER ff99SB force fieldiS. As for 
solvent effects, we used the GB/SA mode P°'^^ included 
in the TINKER program packageii. In order to test the 
effectiveness of the present method more quantitatively, 
we have to include more rigorous solvation models, which 
we will do in a future work. The individuals (replicas) for 
the simulations had different sets of randomly generated 
initial velocities. We also performed conventional MD 
and REM simulations for comparisons. The simulation 
conditions were the same as above except the crossover 
and selection operations. In order to balance the com¬ 
putational cost, we performed independent 16 simulation 
runs of 10.0 nsec in length for the conventional MD. 

In Fig. we compare the structure of PDB (1L2Y 
modell) and the lowest-RMSD conformations obtained 
from the conventional MD simulation and the PMD/GAc 
simulation. The room-mean-square-distance (RMSD) 
values of C“ atoms with respect to the native structure 
for the conventional MD simulation and the PMD/GAc 
simulation are 4.06 A and 1.78 A, respectively. Ob¬ 
viously, the conformation obtained from PMD/GAc is 
more similar to the native structure than that from the 
conventional MD. Moreover, we also compared the struc¬ 
tures with a native-like structure, which was the lowest- 
energy conformation obtained from iso-thermal canoni¬ 
cal simulations at 282 K. This native-like conformation 
was obtained from 16 canonical simulations of 2.0 nsec 
with different sets of randomly generated initial veloci¬ 
ties. The RMSD values with respect to the native-like 
structure for the conventional MD simulation and the 
PMD/GAc simulation are 3.48 A and 1.62 A, respec- 



FIG. 2. The lowest-RMSD structures obtained from the con¬ 
ventional MD (Canonical), PMD/GAc (GC), conventional 
REMD (REM), and REMD/GAc (GG/REM). (a) is the 
lowest-RMSD structures with respect to the experimental re¬ 
sult (Native(PDB), PDB ID:1L2Y modell), and (b) is the 
lowest-RMSD structures with respect to the lowest-energy 
conformation obtained from the conventional iso-thermal 
canonical simulations at 282 K (Native-Like(282K)). 


tively. The conformation obtained from PMD/GAc is in 
better agreement with the native-like structure than that 
from the conventional MD. 

In Fig. [3l the probability distributions of RMSD of all 
conformations obtained from the conventional MD simu¬ 
lation and the PMD/GAc simulation are shown. RMSD 
values obtained from PMD/GAc are lower than those 
of the conventional MD as a whole. The averages of 
RMSD values obtained from the conventional MD and 
PMD/GAc are 7.06 A and 5.50 A, respectively. Hence, 
PMD/GAc can search the conformational space around 
the native structure efficiently in comparison with the 
conventional MD. 

We now examine the results of REMD/GAc simula¬ 
tion. In Fig. [21 we compare the lowest-RMSD confor¬ 
mations obtained from the conventional REMD simu¬ 
lation and REMD/GAc simulation. The RMSD val¬ 
ues with respect to the PDB structure for the conven¬ 
tional REMD simulation and the REMD/GAc simula¬ 
tion are 2.03 A and 1.93 A, respectively. Hence, these 
conformations are almost the same and similar to the 
PDB structure. Moreover, the RMSD values with re¬ 
spect to the native-like structure for the conventional 
REMD simulation and the REMD/GAc simulation are 
1.78 A and 1.27 A, respectively. The conformation ob¬ 
tained from REMD/GAc is in slightly better agreement 
with the native-like structure than that from the conven¬ 
tional REMD. 

In Fig. |3l the probability distributions of RMSD of 
all conformations obtained from the conventional REMD 
simulation and the REMD/GAc simulation are shown. 
The obtained ranges of the RMSD values of both conven¬ 
tional REMD and REMD/GAc simulations are broad. 
The averages of RMSD values obtained from the con¬ 
ventional REMD simulation and the REMD/GAc sim¬ 
ulations are 6.32 A and 6.58 A, respectively. Hence, 
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there is almost no difference of the two average values. 
However, there are some differences in the distribution of 
conventional REMD and REMD/GAc simulations. The 
peak values of the probability distributions of the conven¬ 
tional REMD simulation and the REMD / GAc simulation 
are 7.28 A and 6.68 A, respectively. Moreover, there is 
another small peak around 3.26 A for the conventional 
REMD simulation. On the other hand, there are not 
any peaks except for the highest peak in the case of the 
REMD/GAc simulation. These results suggest that the 
REMD/GAc simulation did not get trapped in any lo¬ 
cal minima in comparison with the conventional REMD 
simulation. 

In order to further examine the sampling efficiency of 
the conventional REMD simulation and the REMD/GAc 
simulation, we counted the number of tunneling events. 
A tunneling event means a random walk from the lowest- 
energy region to the highest-energy region and back, and 
is observed when a system goes from an energy minimum 
to another minimum via a high-energy region^iS^. If the 
number of the tunneling events is large, the conforma¬ 
tional sampling is considered to be more efhcient^i^. The 
numbers of the tunneling events obtained from the con¬ 
ventional REMD simulation and the REMD/GAc simu¬ 
lation are 54 and 107, respectively (these numbers are the 
total for all the 16 replicas). We see that REMD/GAc 
can perform more efficient conformational search by us¬ 
ing the crossover operation. Here, the average of the 
acceptance ratio of crossover operations was 0.26. How¬ 
ever, the ratio must depend on the system and the length 
n of the crossover operations. 


IV. CONCLUSIONS 

In this work, we introduced two conformational sam¬ 
pling methods based on genetic crossover and applied 
them to a mini protein, Trp-cage. One method is a com¬ 
bination of conventional molecular dynamics and genetic 
crossover, and the other is a further combination with 
the replica-exchange method. These methods realize a 
broader conformational search by the genetic crossover, 
which is based on global conformational updates. Gon- 
formations close to the native structure were successfully 
obtained by these methods. 

The genetic crossover sampling methods have a big ad¬ 
vantage of being highly parallelizable on parallel comput¬ 
ers. In the future, we are going to apply these methods 
to various large proteins in explicit solvent. 
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